Do Language Embeddings Capture Scales?
Pretrained Language Models (LMs) have been shown to possess significant
linguistic, common sense, and factual knowledge. One form of knowledge that has
not been studied yet in this context is information about the scalar magnitudes
of objects. We show that pretrained language models capture a significant
amount of this information but are short of the capability required for general
common-sense reasoning. We identify contextual information in pre-training and
numeracy as two key factors affecting their performance and show that a simple
method of canonicalizing numbers can have a significant effect on the results.
Comment: Accepted at EMNLP Findings 2020 and the EMNLP BlackboxNLP workshop 2020; 8 pages, 2 figures; minor changes to the acknowledgment section
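The abstract does not describe the canonicalization procedure itself. As a minimal sketch of one plausible approach, rewriting surface numbers into a uniform scientific-notation token so that magnitude is explicit, the following is an illustrative assumption, not the paper's method:

```python
import re

def canonicalize_numbers(text: str) -> str:
    """Rewrite each number in `text` as a canonical scientific-notation
    token, e.g. '6,000' -> '6.0e+03'. A hypothetical illustration of
    number canonicalization, not the paper's exact procedure."""
    def repl(match: re.Match) -> str:
        value = float(match.group(0).replace(",", ""))
        return f"{value:.1e}"  # one significant decimal plus exponent
    # Match integers or decimals, optionally with thousands separators.
    return re.sub(r"\d+(?:,\d{3})*(?:\.\d+)?", repl, text)

print(canonicalize_numbers("An elephant weighs about 6,000 kilograms."))
# -> "An elephant weighs about 6.0e+03 kilograms."
```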
Simfluence: Modeling the Influence of Individual Training Examples by Simulating Training Runs
Training data attribution (TDA) methods offer to trace a model's prediction
on any given example back to specific influential training examples. Existing
approaches do so by assigning a scalar influence score to each training
example, under a simplifying assumption that influence is additive. But in
reality, we observe that training examples interact in highly non-additive ways
due to factors such as inter-example redundancy, training order, and curriculum
learning effects.
To study such interactions, we propose Simfluence, a new paradigm for TDA
where the goal is not to produce a single influence score per example, but
instead a training run simulator: the user asks, ``If my model had trained on
example $z_1$, then $z_2$, ..., then $z_n$, how would it behave on
$z_{\text{test}}$?''; the simulator should then output a simulated training run, which
is a time series predicting the loss on $z_{\text{test}}$ at every step of the
simulated run. This enables users to answer counterfactual questions about what
their model would have learned under different training curricula, and to
directly see where in training that learning would occur.
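The abstract does not spell out the simulator's functional form. As a minimal sketch, assuming a Markovian affine update whose coefficients are tied to the training example consumed at each step (the parameter names and update rule below are illustrative assumptions), such a simulator could look like:

```python
import numpy as np

def simulate_run(curriculum, a, b, initial_loss):
    """Simulate the loss on a held-out example over a training run.

    A sketch of a linear, Markovian loss simulator in the spirit of
    Simfluence: each step's loss is an affine function of the previous
    step's loss, with coefficients tied to the training example consumed
    at that step (coefficients would be fit on observed past runs).

    curriculum   : sequence of training-example ids, one per step
    a, b         : dicts mapping example id -> multiplicative and
                   additive factors
    initial_loss : loss on the held-out example before training
    """
    losses = [initial_loss]
    for c in curriculum:
        losses.append(a[c] * losses[-1] + b[c])
    return np.array(losses[1:])

# Counterfactual query: how would loss evolve under a reordered curriculum?
a = {"z1": 0.95, "z2": 0.90, "z3": 1.01}   # toy factors
b = {"z1": -0.01, "z2": 0.00, "z3": 0.02}
print(simulate_run(["z2", "z1", "z3", "z1"], a, b, initial_loss=2.3))
```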
We present a simulator, Simfluence-Linear, that captures non-additive
interactions and is often able to predict the spiky trajectory of individual
example losses with surprising fidelity. Furthermore, we show that existing TDA
methods such as TracIn and influence functions can be viewed as special cases
of Simfluence-Linear. This enables us to directly compare methods in terms of
their simulation accuracy, subsuming several prior TDA approaches to
evaluation. In experiments on large language model (LLM) fine-tuning, we show
that our method predicts loss trajectories with much higher accuracy than
existing TDA methods (doubling Spearman's correlation and reducing mean-squared
error by 75%) across several tasks, models, and training methods.
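The two metrics named above pair a rank-based score with an absolute one. As a rough illustration of that evaluation protocol (our own sketch, not the paper's released code), scoring a simulated loss trajectory against the observed one could look like:

```python
import numpy as np
from scipy.stats import spearmanr

def simulation_accuracy(predicted, observed):
    """Score a simulated loss trajectory against the real one using the
    two metrics named in the abstract: Spearman rank correlation and
    mean-squared error. `predicted` and `observed` are equal-length
    sequences of per-step losses on a held-out example."""
    rho, _ = spearmanr(predicted, observed)
    mse = float(np.mean((np.asarray(predicted) - np.asarray(observed)) ** 2))
    return rho, mse

rho, mse = simulation_accuracy([2.3, 2.1, 2.2, 1.8], [2.4, 2.0, 2.1, 1.7])
print(f"Spearman: {rho:.2f}, MSE: {mse:.4f}")
```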